Job Description
- Develop Big Data applications following the Agile software development life cycle for fast, efficient progress.
- Evaluate emerging tools and build proofs of concept for integrating them into the current system.
- Write software to ingest data into Hadoop, and build scalable, maintainable Extract, Transform, and Load (ETL) jobs.
- Create Scala/Spark jobs for data transformation and aggregation, with a focus on the functional programming paradigm.
- Build distributed, reliable, and scalable data pipelines to ingest and process data in real time, handling impression streams, transaction behavior, clickstream data, and other unstructured data.
- Load data into Spark RDDs and perform in-memory computation to generate output responses.
- Develop shell scripts to automate data management across end-to-end integration work.
- Develop Oozie workflows to schedule and orchestrate ETL processes.
- Create an environment for accessing loaded data via Spark SQL through JDBC/ODBC (via the Spark Thrift Server).
- Develop real-time data ingestion and analysis using Kafka and Spark Streaming.
- Design and implement a backup and disaster recovery strategy based on the Cloudera BDR utility for batch applications and Kafka MirrorMaker for real-time streaming applications.
- Implement Hadoop security, including Kerberos, Cloudera Key Trustee Server, and key management systems.
- Use Hive join queries to join multiple tables of a source system and load the results into Elasticsearch.
- Work on Cassandra and QueryGrid. Implement shell scripts to move data between relational databases and HDFS (Hadoop Distributed File System).
- Optimize and tune cluster performance by adjusting parameters based on benchmarking results.
- Run complex queries and work on bucketing, partitioning, joins, and sub-queries.
- Apply different HDFS file formats and structures, such as Parquet and Avro, to speed up analytics.
- Transform, load, and present disparate data sets in various formats and from various sources, such as JSON, text files, Kafka queues, and log data.
- Write complex workflow jobs using Oozie and set up a multi-program scheduling system to manage multiple Hadoop, Hive, Sqoop, and Spark jobs.
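As a rough illustration of the transformation-and-aggregation work described above, the sketch below mimics the map/reduce shape of a Spark job in plain Python (no Spark dependency); the record layout and sample clickstream values are hypothetical, not part of the role's actual codebase.

```python
# Minimal sketch: aggregate hypothetical clickstream events per user,
# following the same map/reduceByKey shape a Spark RDD pipeline would use.
from collections import defaultdict

# Each event: (user_id, page, duration_ms) -- illustrative sample data.
events = [
    ("u1", "/home", 1200),
    ("u1", "/search", 800),
    ("u2", "/home", 400),
]

# "Map" each event to a (key, value) pair, then "reduce by key" by summing.
totals = defaultdict(int)
for user_id, _page, duration_ms in events:
    totals[user_id] += duration_ms

print(dict(totals))  # e.g. {'u1': 2000, 'u2': 400}
```

In an actual Spark job the same logic would be expressed as a `map` to `(user_id, duration_ms)` pairs followed by `reduceByKey`, letting the cluster perform the aggregation in memory across partitions.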
Required Skills:
- A minimum of a bachelor's degree in Computer Science or equivalent.
- Cloudera Hadoop (CDH), Cloudera Manager, Informatica Big Data Edition (BDM), HDFS, YARN, MapReduce, Hive, Impala, Kudu, Sqoop, Spark, Kafka, HBase, Teradata Studio Express, Teradata, Tableau, Kerberos, Active Directory, Sentry, TLS/SSL, Linux/RHEL, Unix, Windows, SBT, Maven, Jenkins, Oracle, MS SQL Server, Shell Scripting, Eclipse IDE, Git, SVN
- Must have strong problem-solving and analytical skills.
- Must be able to identify complex problems, review related information, develop and evaluate options, and implement solutions.
If you are interested in working in a fast-paced, challenging, fun, entrepreneurial environment and would like the opportunity to be part of this fascinating industry, send your resume to HSTechnologies LLC, 2801 W Parker Road, Suite #5, Plano, TX 75023, or email it to hr@sbhstech.com.